17-1 HTK Introduction (HTK 簡)

HTK (Hidden Markov Model Toolkit) is a freely available software toolkit for building and training hidden Markov models (HMMs), used mostly for automatic speech recognition. Most of the related information can be found at the HTK website:

http://htk.eng.cam.ac.uk/
HTK was originally developed by the Machine Intelligence Lab of the Engineering Department at Cambridge University. In 1999, Microsoft acquired HTK from its then owner, Entropic Inc., and made the source code freely available so that speech technology could advance through collective effort. As a result, the HTK source code can now be downloaded directly from its website.

Implementing automatic speech recognition involves advanced techniques for HMM training and evaluation, which are not easy for most programmers to master. The availability of the HTK source code has lowered this entry barrier and accelerated research in speech technology. Many research labs in universities and industry now use HTK for their research and development, and it is widely regarded as the de facto standard tool for developing automatic speech recognition systems.

This chapter uses several basic examples to demonstrate the use of HTK. We have tried to keep the examples as simple as possible without omitting any important features of HTK. After becoming familiar with these examples, readers can consult the HTK manual for more advanced features to use in their own research and development.


Audio Signal Processing and Recognition (音訊處理與辨識)